A dissection of the cauliflower mosaic 
virus polyadenylation signal 






Hélène Sanfaçon,1 Peter Brodmann, and Thomas Hohn
Friedrich Miescher Institut, CH-4002 Basel, Switzerland



Mutagenesis analysis of the polyadenylation [poly(A)] signal from the cauliflower mosaic virus (CaMV), a plant
pararetrovirus, revealed striking differences from known vertebrate poly(A) signals. Our results show that (1) the
AATAAA sequence is necessary for efficient cleavage at the poly(A) site, although the requirement for an
authentic AATAAA might be less stringent in plant than in vertebrate cells; (2) surprisingly and in contrast to
the majority of vertebrate poly(A) signals, the sequences downstream of the CaMV poly(A) site do not influence
processing efficiency drastically although they affect the precision of cleavage; and (3) deletion of sequences
upstream of the CaMV AATAAA sequence decreased processing at the CaMV site dramatically, suggesting the
presence of one or several positively acting upstream elements. An oligonucleotide consisting of CaMV
upstream sequences could induce the recognition of a normally silent exogenous poly(A) signal when inserted
upstream of its AATAAA motif.

[Key Words: mRNA 3'-end formation; plant polyadenylation signal; cauliflower mosaic virus; AATAAA
mutation; upstream-downstream deletions; RNase protection assay]

Received July 1990; revised version accepted October 15, 1990.



The formation of the mRNA 3' end has been studied extensively in vertebrate cells (for recent
reviews, see Humphrey and Proudfoot 1988; Manley 1988). The premature transcript is cleaved by
an endonuclease prior to transcription termination, and a poly(A) tail is added at the cleavage
site by the poly(A) polymerase. The cleavage and the poly(A) addition reactions are tightly
coupled in vivo and involve large polyadenylation complexes containing many additional factors
(Humphrey et al. 1987; Zarkower and Wickens 1987; Zhang and Cole 1987; Christofori and Keller
1988, 1989; Gilmartin et al. 1988; McDevitt et al. 1988; Takagaki et al. 1988). Efficient
transcription termination depends on the presence of a functional poly(A) site (Whitelaw and
Proudfoot 1986; Logan et al. 1987; Connelly and Manley 1988; Lanoix and Acheson 1988) and can
occur up to several thousand nucleotides farther downstream (for review, see Proudfoot 1989),
most likely at a polymerase II pause site (Connelly and Manley 1989).

In vertebrate genes, the hexanucleotide AATAAA is almost always present 10-30 nucleotides
upstream of the cleavage site (Proudfoot and Brownlee 1976). Point mutations in this sequence
greatly reduce the efficiency of 3'-end formation in vivo (Fitzgerald and Shenk 1981; Montell
et al. 1983; Wickens and Stephenson 1984) and of both cleavage and polyadenylation in vitro
(Manley et al. 1985; Zarkower et al. 1986; Conway and Wickens 1987; Wilusz et al. 1989). In
addition, a less well-conserved T-rich or TG-rich element, usually situated immediately
downstream of the poly(A) addition site, is important for both reactions in vivo and in vitro
(Gil and Proudfoot 1984; Coles and Stacy 1985; Conway and Wickens 1985; Hart et al. 1985;
McLauchlan et al. 1985; Sadofsky et al. 1985; Sperry and Berget 1986). Both the AATAAA and the
downstream element are required for formation of the specific cleavage and polyadenylation
complexes (Skolnik-David et al. 1987; Gilmartin and Nevins 1989). Additional upstream sequences
were shown to improve recognition of the SV40 late poly(A) signal (Carswell and Alwine 1989)
and the hepatitis B virus poly(A) signal (Russnak and Ganem 1990) and to influence poly(A) site
selection in the adenovirus late transcription unit (DeZazzo and Imperiale 1989).

1Present address: Agriculture Canada Research Station, Vancouver, B.C. V6T 1X2 Canada.

Much less is known about mRNA 3'-end formation in plants. Polyadenylation signals from
mammalian genes are not properly recognized in plants (Hunt et al. 1987), suggesting that
either the sequences required for 3'-end formation or the mechanism of 3'-end formation might
differ in plants. Heterogeneity of plant mRNA 3' ends due to the presence of multiple poly(A)
signals has been described (Dean et al. 1986), contrasting with the single poly(A) site present
in most vertebrate genes. Furthermore, a perfectly conserved AATAAA sequence is present in only
40% of known plant genes (Joshi 1987), suggesting that this sequence might not be absolutely
required for 3'-end formation in plant cells. In a computer survey, a putative consensus
sequence, TGTGTTT, was proposed to be involved in 3'-end formation. However, this sequence was
found to be located in a widespread area downstream of the cleavage site (Joshi 1987). Very few
functional analyses have been made using plant poly(A) signals. Analysis of the potato
wound-inducible proteinase inhibitor II gene (An et al. 1989) and of the octopine synthetase
gene (Ingelbrecht et al. 1989) showed reduction of reporter gene activity and of the
steady-state RNA level on deletion of an AATAAA sequence and of a downstream TG-rich element.
Deletion analysis of the pea ribulose-1,5-biphosphate carboxylase small-subunit gene suggested
that elements located both upstream and downstream of the cleavage site are involved in mRNA
3'-end formation (Hunt and MacDonald 1989).

GENES & DEVELOPMENT 5:141-149 © 1991 by Cold Spring Harbor Laboratory Press ISSN 0890-9369/91 $1.00

The CaMV, a plant pararetrovirus, produces two major transcripts that are polyadenylated at the
same site (Covey et al. 1981; Guilley et al. 1982). In a previous study, we showed that
proximity to the promoter inhibits recognition of this signal. However, very efficient cleavage
at the CaMV processing site was directed by the sequences contained in a 195-nucleotide-long
CaMV fragment when placed downstream of a reporter gene (Sanfaçon and Hohn 1990). The primary
structure of this fragment is shown in Figure 1. It spans 180 nucleotides upstream and 15
nucleotides downstream of the cleavage site. Here, we present a mutational analysis of this
signal. Several elements involved in 3'-end formation were defined.



Results 

Analysis of the primary structure of a 195-nucleotide
CaMV fragment containing the poly(A) signal

A perfectly conserved AATAAA sequence is situated 13 nucleotides upstream of the poly(A)
addition site of CaMV RNA (Fig. 1). Two TG-rich regions, resembling the TG-rich sequence found
downstream of the poly(A) addition site in vertebrate genes (Gil and Proudfoot 1984; Coles and
Stacy 1985; Conway and Wickens 1985; Sadofsky et al. 1985) and possibly in plant genes (Joshi
1987), are found 70 and 124 nucleotides upstream of the processing site. A third TG-rich region
is located 32 nucleotides upstream of the processing site and consists of several tandem
repeats of the sequence TTTGTA (Fig. 1). This sequence also resembles a sequence proposed to be
involved in mRNA 3'-end formation in yeast (TAG . . . TAGT . . . TTT; Zaret and Sherman 1982).
Surprisingly, in contrast to vertebrate genes, the CaMV sequences downstream of the processing
site are AC-rich.
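
Positional statements of this kind (a motif lying a given number of nucleotides upstream of the processing site) can be checked with a simple motif scan. The following is a minimal Python sketch, not part of the original analysis. It assumes a hypothetical toy sequence (not the CaMV sequence of Figure 1), a cleavage position given as a 0-based index of the first nucleotide after the cut, and a distance counted as the number of nucleotides between the end of the motif and the cut; the convention used in the paper may differ.

```python
import re


def upstream_distances(seq, motif, cleavage_index):
    """Return the gap (in nucleotides) between the end of each motif
    occurrence and the cleavage position, for occurrences lying entirely
    upstream of the cut. Overlapping occurrences are reported as well."""
    distances = []
    # A zero-width lookahead lets re.finditer report overlapping matches.
    for match in re.finditer(f"(?={re.escape(motif)})", seq):
        motif_end = match.start() + len(motif)
        if motif_end <= cleavage_index:
            distances.append(cleavage_index - motif_end)
    return distances


# Hypothetical toy sequence: 4 nt, then AATAAA, then 13 nt, then the cut.
toy_sequence = "GGCC" + "AATAAA" + "C" * 13 + "GGTT"
toy_cleavage_index = 4 + 6 + 13

print(upstream_distances(toy_sequence, "AATAAA", toy_cleavage_index))
```

The same function can be applied to any other motif named in the text, such as TTTGTA, once a reliable sequence and cleavage coordinate are available.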

An AATAAA sequence is required for processing at the
CaMV site

Figure 1. Primary structure of the CaMV and the nos polyadenylation signal fragments. Uppercase
letters show original sequence from each signal; lowercase letters are polylinker sequences.
The restriction sites are indicated by a thin line above or below the sequence. In the standard
construct (R-CAT) the CaMV poly(A) signal fragment is followed immediately by the nos poly(A)
signal fragment. The HindIII site is at the junction of those two fragments and is shown both
at the end of the CaMV fragment and at the beginning of the nos fragment. Solid boxes above the
sequence show the normal processing sites in the CaMV and nos fragments. The open box above the
nos sequence represents a cryptic processing site (cryptic nos in the text). Sequences that may
be important for polyadenylation are boxed: three upstream TG-rich sequences and the AATAAA
sequence in the CaMV fragment. An AATAAA sequence that is not recognized in the wild-type nos
gene (cryptic AATAAA in the text) is enclosed in a dotted box. Solid or open arrows above the
third TG-rich box in the CaMV sequence show perfect and imperfect repeats of the sequence
TTTGTA. The large numbers above the CaMV sequence show the exact end point of the upstream
BAL-31 deletions discussed in the text; the smaller numbers below the sequence represent the
position of relevant nucleotides relative to the CaMV processing site.

The plasmid R-CAT (Sanfaçon and Hohn 1990; Fig. 2), containing the chloramphenicol
acetyltransferase (CAT) reporter gene between CaMV 35S promoter and poly(A) signal, was used as
a standard plasmid to study the CaMV poly(A) signal by site-directed mutagenesis. To rescue the
RNA molecules that are not processed at the CaMV site, R-CAT contains a second poly(A) signal
from the nopaline synthetase (nos) gene (Bevan et al. 1982; Depicker et al. 1982) downstream of
the CaMV fragments. The proportion of transcripts processed at the CaMV versus the nos site was
used as a measure for the efficiency of processing at the CaMV site. Plasmid R-CAT was
introduced into Nicotiana plumbaginifolia protoplasts using a transient expression system, and
total RNA was analyzed in an RNase protection assay with an antisense probe (P1) covering both
poly(A) signals (see Fig. 2A). The standard plasmid gave rise to transcripts cleaved only at
the CaMV poly(A) addition site, which protect 190 nucleotides of the probe (Fig. 3).
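
The measure described above, the proportion of transcripts processed at each site, amounts to normalising the signal of each protected band to the total. The sketch below illustrates that arithmetic in Python. It is not the authors' quantitation procedure, which is not described here: the band signals are hypothetical numbers, and the optional division by protected-fragment length (to account for longer fragments of a uniformly labeled probe carrying more label) is an assumption added for illustration.

```python
def site_percentages(band_signal, protected_length=None):
    """Convert band signals into the percentage of transcripts per site.

    band_signal: dict mapping site name -> measured band signal.
    protected_length: optional dict mapping site name -> length (nt) of
        the protected fragment; when given, each signal is divided by its
        length before normalisation (an assumed correction).
    """
    corrected = {}
    for site, signal in band_signal.items():
        length = 1 if protected_length is None else protected_length[site]
        corrected[site] = signal / length
    total = sum(corrected.values())
    if total <= 0:
        raise ValueError("no signal to normalise")
    return {site: 100.0 * value / total for site, value in corrected.items()}


# Hypothetical band signals in arbitrary units.
example_signal = {"CaMV": 1100.0, "cryptic nos": 700.0, "nos": 200.0}
for site, percent in site_percentages(example_signal).items():
    print(f"{site}: {percent:.0f}%")
```

With these made-up inputs the CaMV band accounts for 55% of the total; real values would come from densitometry or counting of the protected fragments.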

As a first step in the analysis of the CaMV poly(A) signal, the AATAAA sequence was mutated by
oligonucleotide-directed mutagenesis of plasmid R-CAT. The precise deletion of the CaMV AATAAA
prevented processing at the CaMV site (Fig. 3, construct ΔAATAAA), confirming our earlier
results (Sanfaçon and Hohn 1990). Surprisingly, the rescued transcripts were not cleaved at the
previously described nos processing site (Bevan et al. 1982; Depicker et al. 1982), but at a
cryptic site ~100 nucleotides upstream of it and 20 nucleotides downstream of an AATAAA
sequence that is not recognized in the wild-type nos gene (Fig. 1; Bevan et al. 1982; Depicker
et al. 1982).

Figure 2. Strategy for the RNA mapping. (A) Strategy for the mapping of RNAs produced from the
construct R-CAT or mutant derivatives. Boxes represent the CaMV 35S promoter enhancer, the CAT
coding region, a fragment containing the CaMV poly(A) signal, and a fragment containing the nos
poly(A) signal. The thin lines above the construct represent the transcripts processed at
either the CaMV site, the cryptic nos, or the normal nos sites (which are indicated by small
arrows at bottom). The two antisense probes that were used with those constructs are indicated
by the upper bold lines. The lower horizontal arrow indicates the starting point and the
direction of the BAL-31 deletions. (B) Strategy for mapping of RNAs produced from the construct
nos and its derivatives with CaMV inserts. Symbols are as in A. The site of insertion of the
oligonucleotides containing specific CaMV sequences is indicated by the large solid arrow at
the bottom.

A T→G point mutation of the AATAAA motif was recognized at half of the wild-type efficiency
(Fig. 3, construct AAGAAA). Fifty-five percent of the transcripts were processed at the CaMV
site and most of the remaining transcripts at the cryptic nos site (35%). It is noteworthy that
in vertebrate in vivo (Montell et al. 1983) and in vitro (Wilusz et al. 1989) systems, a
similar mutation would reduce cleavage by more than 95%.

These results suggest that the AATAAA sequence is required for efficient 3'-end processing in
CaMV. Furthermore, additional sequences in the CaMV poly(A) signal are likely to be needed.
These sequences probably allowed recognition of the cryptic AATAAA in the nos signal in the
absence of the CaMV AATAAA motif.

Sequences immediately downstream of the processing
site influence the position of the CaMV cleavage site

A stretch of only 15 nucleotides of original CaMV sequence downstream of the processing site is
included in plasmid R-CAT (see Fig. 1). Addition of the following 200 nucleotides farther
downstream, isolated from the CaMV leader, had no effect on the efficiency of processing at the
CaMV site (not shown). Therefore, if an important element downstream of the CaMV processing
site exists, as in vertebrate genes, it must be located within those 15 nucleotides. To test
this, this sequence was deleted precisely (Fig. 4, construct ΔDo) with the exception of an A
residue close to the processing site, which would be of importance in vertebrate genes
(Fitzgerald and Shenk 1981; Mason et al. 1986). Approximately 60% of the RNA molecules produced
from this construct were cleaved at the CaMV site. The remainder were processed at either the
cryptic nos site or at a new site upstream of it (Fig. 5). The slight decrease of processing at
the CaMV site could be explained by two possibilities. We might have either deleted a
positively acting element (e.g., an AC-rich region) or created a negatively acting element
(e.g., by bringing the polylinker closer to the processing site).

To distinguish between these two possibilities, we further deleted most of the polylinker in
either the standard plasmid (construct ΔSst) or in the construct ΔDo (constructs ΔDoΔSst,
ΔDoΔKpn1, and ΔDoΔKpn2). More than 95% of the RNA molecules produced from constructs ΔSst,
ΔDoΔSst, and ΔDoΔKpn1 were processed around the CaMV site (Fig. 4). However, the 3' ends of
those RNAs showed more heterogeneity than RNAs produced from the standard plasmid (see Fig. 5).
RNAs produced from construct ΔDoΔKpn2 also showed 3'-end heterogeneity at the CaMV site and
some readthrough to the cryptic nos site (12% of the transcripts). Taken together, these
results suggest that in contrast to most reported higher eukaryotic systems, the primary
sequence downstream of the CaMV processing site does not contain elements necessary for 3'-end
formation, although the primary or the secondary structure in the immediate surrounding of the
poly(A) site might influence the precise positioning of cleavage.

Figure 3. Protection assay of RNAs produced from the AATAAA mutants with probe P1. Lane MW
indicates the position of relative molecular weight markers (32P-labeled single-strand DNA
fragments). For each construct the sense-antisense hybrids are shown before (-) or after (+)
digestion with RNase A and T1. Antisense probes used were the exact mutant derivatives of probe
P1 (Fig. 2). Expected sizes of transcripts processed at the CaMV (■), the cryptic (▲), and the
normal (●) nos sites are shown for each lane.

Figure 4. Protection assay with RNAs produced from mutants in the downstream region with probe
P1. (MW) Relative molecular weight markers (32P-labeled single-stranded DNA fragments). Hybrids
before (-) and after (+) digestion are shown. The antisense probes used are mutant derivatives
of probe P1. The expected sizes for transcripts processed at the CaMV (■) or at the cryptic nos
(▲) sites are indicated for each lane.

Sequences upstream of AATAAA are important for
CaMV RNA 3'-end formation

To localize sequences important for processing, we progressively deleted the sequences upstream
from the AATAAA by BAL-31 digestion from the 5' end of the CaMV poly(A) signal fragment toward
the AATAAA sequence (Fig. 2A). The mutants obtained were named according to the number of
nucleotides remaining upstream of the processing site (as shown in Fig. 1). The sequence at the
junction between the CAT gene and the truncated CaMV signal varied slightly from one clone to
another, as indicated in Materials and methods. RNAs were analyzed using probe P1 (see Fig. 2).
The length of the protected fragments for transcripts processed at the CaMV, the cryptic, and
the normal nos poly(A) addition site decreases with increasing deletion (Fig. 6A). Transcripts
processed at the CaMV site in mutants with the largest deletions (constructs Δ75-Δ32) would be
very small and therefore difficult to quantitate with our procedure. To increase the size of
the protected antisense fragments, we therefore used homologous probes (probe series P1-1) for
each mutant complementary to the end of the CAT gene, the mutated CaMV signal, and the rescuing
nos signal (Figs. 2A and 6B).

On progressive deletion of upstream sequences, the efficiency of cleavage at the CaMV poly(A)
addition site is reduced. The sequences upstream of nucleotide 67, which include two TG-rich
motifs (Fig. 1), are likely to play only an auxiliary role in mRNA 3'-end formation since 60%
of the transcripts produced from construct Δ67 were processed at the CaMV site (Fig. 6). In
particular, deletion of the second TG-rich sequence 70 nucleotides upstream of the processing
site did not affect processing efficiency (cf. constructs Δ75 and Δ67, Fig. 6B). Interestingly,
the transcripts that bypassed the CaMV signal are mainly processed at the cryptic rather than
at the normal nos site. Deletion of an additional 24 nucleotides reduced processing at the CaMV
site to 46% (construct Δ44). The transcripts that read through the CaMV signal were processed
at either the cryptic or the normal nos site. Finally, deletion of the subsequent 12
nucleotides allowed only very limited cleavage at the CaMV site (8%, construct Δ32). Most of
the remaining transcripts in this case were processed at the normal nos site. These results
suggest the involvement of at least one and perhaps several upstream elements in the
recognition of AATAAA, either CaMV or nos. The most important element spans the region between
67 and 32 nucleotides upstream of the CaMV processing site (Fig. 1).
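
One way to see where the deletion series loses most activity is to express the drop in CaMV-site processing per deleted nucleotide between consecutive constructs. The Python sketch below does this using only the three percentages quoted in the text (Δ67, Δ44, and Δ32) and takes each construct name as the number of nucleotides remaining upstream of the processing site. It is an illustrative calculation, not an analysis performed in the paper, and it treats loss as uniform within each interval, which is a simplifying assumption.

```python
# (nucleotides remaining upstream, % transcripts processed at the CaMV site)
# Values as quoted in the text for constructs Δ67, Δ44, and Δ32.
deletion_series = [(67, 60.0), (44, 46.0), (32, 8.0)]


def loss_per_nucleotide(series):
    """For each pair of consecutive deletion end points, return
    ((longer, shorter), percentage points lost per deleted nucleotide)."""
    ordered = sorted(series, reverse=True)
    results = []
    for (up_a, pct_a), (up_b, pct_b) in zip(ordered, ordered[1:]):
        results.append(((up_a, up_b), (pct_a - pct_b) / (up_a - up_b)))
    return results


for interval, loss in loss_per_nucleotide(deletion_series):
    print(f"{interval[0]} -> {interval[1]} nt upstream: "
          f"{loss:.2f} percentage points per nucleotide")
```

On these numbers the interval between 44 and 32 nucleotides upstream shows the steepest loss, which is consistent with the text placing the most important element within the region between 67 and 32 nucleotides upstream.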






Figure 5. Primary structure of mutants in the downstream region. Uppercase letters represent
the original CaMV or nos sequences; lowercase letters represent polylinker sequences. The
wild-type CaMV processing site is shown by the solid box above each sequence. Location of
additional processing sites in the downstream mutants was evaluated according to the size of
the protected fragment and is indicated by the open boxes. Numbers above those boxes represent
the percentage of transcripts cleaved at the corresponding site. The cryptic AATAAA in the nos
signal is enclosed in a dotted box.

Figure 6. Protection assays with RNAs produced from mutants in the upstream CaMV region. (A)
Analysis of the BAL-31 deletion mutants with probe P1. (MW) Molecular weight markers. Hybrids
before (-) and after (+) RNase digestion are shown. The expected sizes of transcripts processed
at the CaMV (■), the cryptic (▲), or the normal (●) nos sites are shown for each lane. (B)
Analysis of the four largest BAL-31 deletion mutants with probe P1-1. For each mutant, the P1-1
probe used is a derivative of the wild-type P1-1 with the corresponding junction between the
CAT gene and the deleted CaMV signal. As in A, expected sizes of transcripts processed at each
site are indicated. (C) Analysis of the precise deletion mutant Δ56-46 using probe P1. For
analysis of the mutant, a derivative of probe P1 containing the same mutation in the CaMV
signal was used. (D) Percentage of transcripts processed at each site. Numbers presented are a
compilation of several experiments obtained with either probe P1 or probe P1-1.



A precise deletion of the sequence CCCTTAGTATG situated 56-46 nucleotides upstream of the
processing site (construct Δ56-46) did not affect processing at the CaMV site (Fig. 6C),
suggesting that the upstream element mentioned above might be restricted to 45 nucleotides
upstream of the processing site. These sequences include perfect repeats of the motif TATTTGTA.

To determine which of the CaMV sequences induced recognition of the normally silent nos AATAAA
in several of our deletion mutants, we inserted two types of oligonucleotides containing part
of the CaMV upstream sequence, in either orientation, 35 nucleotides upstream of the cryptic
and 150 nucleotides upstream of the normal nos poly(A) addition site in plasmid nos (Fig. 2B).
Oligonucleotide 53-32, which includes sequences 53-32 nucleotides upstream of the CaMV
processing site (TTAGTATGTATTTGTATTTGTA), and oligonucleotide 76-70, which includes sequences
76-70 nucleotides upstream of the CaMV processing site (TGTGTTG), were chosen (Fig. 1; see also
Materials and methods). The RNAs produced from these constructs were analyzed using homologous
probes Pnos, which covered for each mutant the nos fragment, the introduced oligonucleotide,
and the end of the CAT gene (Fig. 2B). In the parent plasmid, processing occurred only at the
normal nos site (Fig. 7, construct nos). Introduction of oligonucleotide 53-32 induced the
recognition of the cryptic AATAAA, since 30% of the transcripts were cleaved at the cryptic
site [Fig. 7, construct nos +(53-32)]. This element had no effect when placed in the reverse
orientation [Fig. 7, construct nos +(32-53)]. The sequences contained in oligonucleotide 76-70
did not induce the AATAAA recognition in either orientation. The active oligonucleotide (53-32)
includes tandem repeats of the TATTTGTA motif mentioned above.
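
The repeat content of the active oligonucleotide can be tallied with an overlapping motif search. The Python sketch below is an illustration only. It assumes that the CaMV portion of oligonucleotide 53-32 reads TTAGTATGTATTTGTATTTGTA, which is our reading of the sequence given in the text and should be checked against Figure 1; the helper name is hypothetical.

```python
import re


def overlapping_starts(seq, motif):
    """Return the 0-based start index of every occurrence of motif in seq,
    including occurrences that overlap one another."""
    # A zero-width lookahead matches at each start without consuming text.
    return [m.start() for m in re.finditer(f"(?={re.escape(motif)})", seq)]


# Assumed reading of the CaMV sequence 53-32 nucleotides upstream of the
# processing site (22 nt).
oligo_53_32 = "TTAGTATGTATTTGTATTTGTA"

for motif in ("TTTGTA", "TATTTGTA"):
    starts = overlapping_starts(oligo_53_32, motif)
    print(f"{motif}: {len(starts)} occurrence(s) at index {starts}")
```

Under that reading, both TTTGTA and the longer TATTTGTA occur twice, and the two TATTTGTA copies overlap by two nucleotides, which is why an overlapping search is used rather than `str.count`.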

Discussion 

We have defined several elements involved in the 3'-end formation of CaMV RNA. Interestingly,
the requirements for this plant poly(A) signal seem to differ from what is known for vertebrate
systems. To our knowledge, the results presented here and earlier (Sanfaçon and Hohn 1990)
provide the first direct evidence that an AATAAA sequence is important for mRNA 3'-end
formation in plant cells. We also show that, in contrast to vertebrate systems, a point
mutation in this sequence is partly recognized. It is noteworthy that in vertebrate in vivo
(Montell et al. 1983) and in vitro (Wilusz et al. 1989) systems, a similar mutation would
reduce cleavage by >95%. Apparently, the stringency for the AATAAA consensus sequence is much
lower than in vertebrate systems. This is in agreement with the previous observation that a
perfect consensus AATAAA sequence upstream of the processing site is present in only 40% of
known plant genes (Joshi 1987). The normal nos processing site, for instance, is not preceded
by a perfect consensus AATAAA (Fig. 1; Bevan et al. 1982; Depicker et al. 1982). It is very
likely that sequences other than AATAAA play an important role and can improve the recognition
of less conserved AATAAA in plant genes.

Figure 7. Protection assay, with probe Pnos, of RNAs produced from the CaMV oligonucleotide
insertions upstream of the nos poly(A) signal. The mapping strategy is shown in Fig. 2B. (MW)
Molecular weight markers; (- and +) hybrids before or after RNase digestion. Expected sizes of
protected fragments corresponding to transcripts processed at the normal (●) and the cryptic
(▲) nos sites are shown for each lane.

The sequences downstream of the CaMV poly(A) site do not influence processing efficiency
drastically, although they affect the precision of cleavage. Similar precision effects were
observed with vertebrate systems using minor mutations in the downstream elements (Woychik et
al. 1984; Mason et al. 1986). The heterogeneity of processing observed with some of our mutants
can be due to either the primary or the secondary structure of sequences surrounding the
cleavage site. The fragment containing the wild-type CaMV and nos poly(A) signals can be folded
into a potential structure consisting of several stems and loops. The processing site is
situated just upstream of an AC-rich region that is relatively unstructured in our model.

In contrast to CaMV, TG-rich regions are found downstream of the cleavage site of vertebrate
genes and are essential for formation of the polyadenylation complexes and for efficiency of
the cleavage and polyadenylation reactions (for review, see Humphrey and Proudfoot 1988). In
vitro, the recognition of a normally silent AATAAA was induced on downstream insertion of an
oligonucleotide containing the TG-rich sequence (Ryner et al. 1989). In plant systems, deletion
of TG-rich sequences downstream of the poly(A) site of the octopine synthetase (Ingelbrecht et
al. 1989) and the potato wound-inducible proteinase inhibitor II (An et al. 1989) genes reduced
steady-state RNA level or reporter gene activity. The latter results would, however, not allow
one to distinguish between effects at the level of cleavage and polyadenylation efficiency, of
differences in mRNA stability, or of transport efficiency. Work with additional plant poly(A)
signals will give information as to whether the absence of a positively acting downstream
element (such as a T-rich or TG-rich element) is a feature of more plant poly(A) signals or a
peculiarity of the CaMV.

We could show that deletion of sequences upstream of CaMV AATAAA inhibits processing at the
CaMV site. Furthermore, the recognition of a normally silent AATAAA was induced on insertion
upstream of oligonucleotide 53-32, which includes the sequence TTAGTATGTATTTGTATTTGTA (Fig. 7).
The induced recognition of this cryptic AATAAA was, however, partial (33% of the transcripts;
Fig. 7). One possible explanation of this partial recognition is that we introduced only one of
several CaMV upstream elements, which perhaps normally act in an additive fashion. This would
be consistent with the results obtained with mutant Δ67, which contained the whole CaMV
sequence of oligonucleotide 53-32 but showed only partial cleavage at the CaMV site (60% of the
transcripts). Alternatively, the positive effect of the upstream element may be dependent on
its position relative to AATAAA. In wild-type CaMV, this element is situated 14 nucleotides
upstream of AATAAA, while in the artificial situation it is 30 nucleotides upstream of the
cryptic nos AATAAA. Such a position-dependent action of downstream elements has been described
in vertebrate cells (Gil and Proudfoot 1987; Heath et al. 1990).

Because the oligonucleotide 53-32 alone has a positive effect on insertion into a heterologous
signal, it is likely that its primary structure rather than the general secondary structure of
the CaMV upstream region is important. The CaMV element contained in this oligonucleotide has
homologies with proposed consensus sequences in vertebrate poly(A) signals (TG-rich downstream
element) and with a proposed consensus sequence TGTGTTT found downstream of some plant
processing sites (Joshi 1987). The other tested sequence, TGTGTTG, which is a single, short
TG-rich element, did not have any influence on either CaMV or cryptic nos AATAAA recognition.
It is possible that the 53-32 upstream sequence is active because of the presence of several
repeats of a short TG-rich sequence (such as TTTGTA or TGTATTT; see Fig. 1). Similarly, it was
shown that the downstream element of the rabbit β-globin gene consisted of several smaller
elements (Gil and Proudfoot 1987). It is also possible that the important feature of the 53-32
sequence is not simply TG richness. It is interesting that this sequence also shows homologies
to a yeast terminator consensus sequence (TAG . . . TAGT . . . TTT; Zaret and Sherman 1982).
However, this homology is perhaps not relevant since a precise deletion of the first part of
this sequence in the CaMV signal allowed processing at wild-type efficiency (Fig. 6, construct
Δ56-46). Furthermore, more recent experiments suggest that this sequence might not be a key
element in yeast mRNA 3'-end processing (Osborne and Guarente 1989). Point mutation analysis of
CaMV upstream elements is under way, which will allow us to determine the important features of
this region.

The involvement of sequences upstream of AATAAA has been described in only a few systems. In
plants, the analysis of the pea ribulose-1,5-biphosphate carboxylase small-subunit gene
suggested the involvement of several upstream and downstream elements (Hunt and MacDonald
1989). In vertebrate systems, it has been shown that an AATAAA and a downstream element are the
only necessary requirements for efficient processing and that the polyadenylation site is
usually at an A residue (Kessler et al. 1986; Levitt et al. 1989). However, upstream elements
in addition to the downstream element have been described for the SV40 late poly(A) signal
(Carswell and Alwine 1989), for the adenovirus late transcription unit (DeZazzo and Imperiale
1989), and for the hepatitis B virus (Russnak and Ganem 1990). In most cases, the upstream
element induces the recognition of weak poly(A) signals, thereby allowing a tight regulation of
mRNA 3'-end formation. In the case of the hepatitis B virus, a pararetrovirus such as CaMV,
part of this element is upstream of the transcription initiation site and allows the production
of terminally redundant RNA. In the case of CaMV, the conditional recognition of the poly(A)
signal is dependent on promoter proximity (Sanfaçon and Hohn 1990) and not on the upstream
element, or elements, which is located downstream of the transcription initiation site. A
possible regulatory role for an upstream element in CaMV remains unclear. Since the element
comprised in oligonucleotide 53-32 stimulates AATAAA recognition in only one orientation, it is
tempting to speculate that it acts at the RNA level, although we cannot rule out the
possibility that it acts at the DNA level. It is possible that this upstream element is
involved in the formation of a polyadenylation complex and compensates for the absence of a
downstream element. Whether the involvement of upstream elements is a particular feature of a
few poly(A) signals (e.g., those originating from viruses or from plants) or a more general
feature remains to be seen.



Materials and methods 
Plasmid constructions 

Construction of the plasmid R-CAT was described previously (Sanfaçon and Hohn 1990). Plasmids
ΔAATAAA, AAGAAA, ΔDo, and Δ56-46 were obtained by oligonucleotide-directed mutagenesis of
plasmid R-CAT. Plasmids ΔSst and ΔDoΔSst were obtained by cutting plasmids R-CAT and ΔDo,
respectively, with SstI and religating the large isolated fragment. Plasmids ΔDoΔKpn1 and
ΔDoΔKpn2 were obtained serendipitously by cutting plasmid ΔDo with KpnI and HindIII and
religating; the relevant sequences are shown in Figure 5. Plasmids Δ149-Δ32 were obtained by
digestion of plasmid R-CAT with PstI between the CAT gene and the CaMV poly(A) signal fragment.
The linearized fragment was digested further with BAL-31 and ligated to either an XhoI or a
PstI linker. The mixture was digested with either XhoI-HindIII or PstI-HindIII and the small
fragment containing the deleted CaMV poly(A) signal was isolated and religated with the large
PstI-HindIII fragment of plasmid R-CAT. The resulting sequences between position -196 (Fig. 1)
and the deletion point are as follows: plasmids Δ149-Δ75, CAACCTTCTCCACC; plasmid Δ67, CAACCTC;
plasmid Δ44, CAACCTCC; and plasmid GAACCTCCTCCACC. Plasmid nos was constructed by ligating the
small PstI-HindIII fragment of pNOSCAT (Fromm et al. 1985) containing the nos poly(A) signal
into the large PstI-HindIII fragment of pDW2 (Pietrzak et al. 1986). Plasmid nos +(53-32) was
constructed by inserting oligonucleotide 53-32,
5'-CTGCAGACTCGAGTTAGTATGTATTTGTATTTGTAGTCGACACTGCAG-3', into the PstI site of plasmid nos. This
oligonucleotide contains the CaMV sequence 53-32 nucleotides upstream of the processing site
surrounded by polylinker sequences: a PstI and an XhoI site on one extremity and a PstI and a
SalI site on the other extremity. Plasmid nos +(32-53) was obtained by cloning the same
oligonucleotide in the reverse orientation. Plasmids nos +(76-70) and nos +(70-76) were built
in a similar manner using oligonucleotide 76-70, 5'-CTGCAGACTCGAGTGTGTTGGTCGACACTGCAG-3', which
includes the relevant CaMV sequence surrounded by the same polylinker sequences. Probe P1 has
been described (Sanfaçon and Hohn 1990). For each mutant a derivative of this probe was
constructed by inserting the small PstI-EcoRI fragment containing the mutated CaMV poly(A)
signal and the wild-type nos poly(A) signal into probe P1. Probes P1-1 were obtained for each
mutant by transferring the corresponding BamHI-HindIII fragment containing the CAT gene and the
CaMV poly(A) signal into the PvuII-HindIII fragment of probe P1. Similarly, probe Pnos was
obtained by transferring the BamHI-HindIII fragment of plasmid nos containing the CAT gene and
the nos poly(A) signal into the PvuII-HindIII fragment of probe P1. Derivatives of probe Pnos
were obtained for the oligonucleotide insertion mutants.

Protoplast transfection and RNA analysis

N. plumbaginifolia protoplasts were transfected, and total RNA was purified after 6 hr and
mapped by RNase protection assay as described previously (Vankan et al. 1988; Goodall and
Filipowicz 1989). For each of the mutants tested a specific antisense probe containing the same
mutation was used. Probes P1 were digested with PstI prior to in vitro transcription, and
probes P1-1 and Pnos were digested with ScaI (site located at the end of the CAT gene). For the
experiments using probe Pnos and derivatives (Fig. 7) the sense-antisense hybrids were digested
by RNase T1 only. This procedure avoided the appearance of secondary bands due to breathing of
the hybrids in a region very






